English-Korean Machine Transliteration by Combining Statistical Model and Web Search
Abstract
Machine transliteration is an automatic method for converting source-language words into phonetically equivalent target-language words. Many previous methods transliterate only words that follow the phonological patterns of the source language, and they perform well on such words. However, many names originate not only from the source language but also from other languages. Existing methods fail to achieve high accuracy on names that come from other languages because they focus on names in the source language. To address this problem, this paper describes a hybrid method that combines a statistical model with web search to improve machine transliteration performance. The proposed method builds a base system on a statistical model to produce candidates, then expands the candidate set from web documents. From these candidates, it selects the most appropriate answer without any external resources. The experimental results show that the proposed method achieves higher performance than either the statistical model or web search alone.
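The candidate-generation and candidate-expansion steps described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the linear score interpolation, the `weight` parameter, and the use of pre-fetched text snippets in place of a live web search are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches runs of precomposed Hangul syllables (U+AC00 to U+D7A3).
HANGUL = re.compile(r"[\uac00-\ud7a3]+")


def expand_from_snippets(snippets):
    """Count every Hangul token found in the given text snippets.

    Hypothetical stand-in for mining candidates from web documents.
    """
    counts = Counter()
    for text in snippets:
        counts.update(HANGUL.findall(text))
    return counts


def select_transliteration(model_candidates, snippets, weight=0.5):
    """Pick one candidate from the union of model and web candidates.

    model_candidates: dict mapping candidate string -> model probability
    (assumed to come from some statistical base system).
    The linear interpolation below is an illustrative assumption.
    """
    web_counts = expand_from_snippets(snippets)
    total = sum(web_counts.values()) or 1  # avoid division by zero
    pool = set(model_candidates) | set(web_counts)

    def score(candidate):
        model_part = model_candidates.get(candidate, 0.0)
        web_part = web_counts[candidate] / total
        return weight * model_part + (1 - weight) * web_part

    return max(pool, key=score)


# Toy data: the model prefers an English-style reading of "Michael",
# while the snippets consistently use a German-style reading.
candidates = {"마이클": 0.6, "미카엘": 0.3}
snippets = ["Michael (미하엘)", "미하엘 Michael", "미하엘"]
best = select_transliteration(candidates, snippets)
```

With no snippets, the function falls back to the highest-probability model candidate. A realistic system would also need to filter web-derived tokens for relevance to the source word, which this sketch does not attempt.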
Related resources
Machine Transliteration Using Multiple Transliteration Engines and Hypothesis Re-Ranking
This paper describes a novel method of improving machine transliteration by using multiple transliteration hypotheses and re-ranking them. We constructed seven machine-transliteration engines to produce a set of transliteration hypotheses. We then re-ranked the hypotheses to select the correct transliteration hypothesis. We propose a re-ranking method that makes use of confidence-score, languag...
A Log-Linear Block Transliteration Model based on Bi-Stream HMMs
We propose a novel HMM-based framework to accurately transliterate unseen named entities. The framework leverages features in letter alignment and letter n-gram pairs learned from available bilingual dictionaries. Letter classes, such as vowels/non-vowels, are integrated to further improve transliteration accuracy. The proposed transliteration system is applied to out-of-vocabulary named-entitie...
Hypothesis Selection in Machine Transliteration: A Web Mining Approach
We propose a new method of selecting hypotheses for machine transliteration. We generate a set of Chinese, Japanese, and Korean transliteration hypotheses for a given English word. We then use the set of transliteration hypotheses as a guide to finding relevant Web pages and mining contextual information for the transliteration hypotheses from the Web page. Finally, we use the mined information...
A Hybrid Approach to English-Korean Name Transliteration
This paper presents a hybrid approach to English-Korean name transliteration. The base system is built on MOSES with factored translation features enabled. We expand the base system by combining it with various transliteration methods, including Web-based n-best re-ranking, a dictionary-based method, and a rule-based method. Our standard run and best nonstandard run achieve 45.1 and 78.5, respect...
NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches
This paper describes our approach to the English-Korean and English-Chinese transliteration tasks of NEWS 2015. We use different grapheme segmentation approaches on the source and target languages to train several transliteration models based on the M2M-aligner and DirecTL+, a string transduction model. Then, we use two reranking techniques based on string similarity and web co-occurrence to select the ...
Publication date: 2011